Abstract: Recent years have seen the
ability to gather massive amounts of data in a large number of domains. As
data is collected at an unprecedented rate, the analysis, rather than the
storage, of this data becomes the challenge.
According to an IDC estimate, 90% of data is unstructured, and it is the
fastest-growing kind of data; the remainder is structured data. Unstructured
data refers to information that either does not have a predefined data model or
does not fit into a relational database for information access. Such data
arrives continuously from various sources such as satellite images,
sensor readings, email messages, social media, web logs, survey results, audio,
and video. Because of the large volume of unstructured data, analysing it and
extracting meaningful value from it is currently a major challenge across
industries. Traditional methods are adequate for the analysis of structured
data, but they are not appropriate for extracting knowledge from large volumes
of unstructured data.
This paper presents a summary of unstructured
data analysis for beginners and for people from academia who are interested
in analysing unstructured data to extract knowledge that can improve
business processes and performance.
Keywords: Unstructured data, structured data, data mining